Computer Sciences Journal | A Comparative Analysis of Huffman Coding with Uniform Coding

The IUP Journal of Computer Sciences :

A Comparative Analysis of Huffman Coding with Uniform Coding

Article Details

Pub. Date	:	January, 2012
Product Name	:	The IUP Journal of Computer Sciences
Product Type	:	Article
Product Code	:	IJCS21201
Author Name	:	Pinaki Mitra
Availability	:	YES
Subject/Domain	:	Management
Download Format	:	PDF Format
No. of Pages	:	6

Price

For delivery in electronic format: Rs. 50; For delivery through courier (within India): Rs. 50 + Rs. 25 for Shipping & Handling Charges

Download

To download this Article click on the button below:

Abstract

In this paper, we analyze the compression ratio achieved by Huffman coding with that of uniform coding. For this, we first study the skewness property of the Huffman coding tree. We demonstrate that this tree will be completely skewed when the sorted frequency distribution of characters satisfies certain prefix properties. We also establish that among all the frequency distributions f1, f2, …, fn of a set of n characters that satisfy the prefix property, the average code length is maximum when the frequency distribution is a Fibonacci sequence. Then we estimate the average code length of Huffman coding for this frequency distribution.

Description

One of the most popular entropy encoding schemes for text data compression is Huffman coding. Huffman code is an optimal unique prefix code. In other words, given a set characters for which the frequency of occurrence of each character is known in advance, the maximum possible compression is achieved by Huffman coding. Also, no character code is a proper prefix of another character code. This makes the code unambiguous and uniquely decodable. Unlike uniform coding where we encode each of the n characters in the character set with equal size code length of [log n] bits, Huffman coding is a variable length coding. For more detailed information relating to Huffman coding refer to Cormen et al. (2006).

In this paper, we analyze the compression ratio achieved by Huffman coding with that of uniform coding. For this, we first study the skewness property of the Huffman coding tree. We demonstrate that this tree will be completely skewed when the sorted frequency distribution of characters satisfies certain prefix properties. We also establish that among all the frequency distributions f₁, f₂, …, f_n of a set of n characters that satisfy the prefix property, the average code length is maximum when the frequency distribution is a Fibonacci sequence. Then we estimate the average code length of Huffman coding for this frequency distribution.

The overall organization of the paper is as follows: in Section 2, we study the frequency distribution of characters for which the Huffman coding tree will be completely skewed. For this, we introduce the notion of prefix property of a sorted sequence. We also establish that for Fibonacci sequence, which also satisfies this prefix property, the average code length will be the maximum. In Section 3, we compute the average code length of Huffman code when the frequency distribution follows the Fibonacci sequence. Lastly, in Section 4, we conclude by summarizing our contribution.

Keywords

Computer Sciences Journal, Business Intelligence, Enterprise Systems, Enterprise Resource Planning, Customer Relationship Management, CRM, Business Operations Management, Business Process Mining, Finite State Machine, Transactional Information System, Genetic Algorithms, Decision Making Process, Data Mining Tools, Online Analysis Processing, OLAP, Artificial Intelligence.